Understanding anti-immigration sentiment spreading on Twitter

Immigration is one of the most salient topics in public debate. Social media heavily influences opinions on immigration, often sparking polarized debates and offline tensions. Studying 220,870 immigration-related tweets in the UK, we assessed the extent of polarization, key content creators and disseminators, and the speed of content dissemination. We identify a high degree of online polarization between pro- and anti-immigration communities. We found that the anti-immigration community is smaller but denser and more active than the pro-immigration community, with the top 1% of users responsible for over 23% of anti-immigration tweets and 21% of retweets. We also discovered that anti-immigration content spreads 1.66 times faster than pro-immigration messages and that bots have minimal impact on content dissemination. Our findings suggest that identifying and tracking highly active users could curb anti-immigration sentiment, potentially easing social polarization and shaping broader societal attitudes toward migration.


Introduction
Public sentiment around immigration is a key societal issue. It polarizes and defines political agendas and policies, and influences social manifestations [1]. Extreme social divisions on immigration harm minorities and diminish social cohesion, creating conditions for violence [2], reduced labor force participation [3], and a weaker sense of belonging among migrants [4]. These frictions are exemplified by political outcomes with clear anti-immigration agendas, like Brexit and the ascent of far-right parties [5], as well as the formation of pro-immigration movements like 'Refugees Welcome' [6].
Online social media networks have become a prominent space to express public opinions about migration and migrants [7]. They represent a public forum where people can express their opinions openly and these can spread rapidly and globally [8]. Unlike traditional methods, such as community meetings or traditional media, social media provides an easy, quick, and remotely accessible open forum where people can freely share their views on immigration. The speed and global reach of these platforms allow opinions to spread rapidly, transcending geographical limitations. However, challenges arise from this dissemination of immigration-related content, including the spread of misinformation and the amplification of polarized views. Nevertheless, social media offers an accessible and dynamic medium for engaging in dialogue, contributing to public discourse, and raising awareness of migration-related challenges.
Existing knowledge about the dissemination of immigration sentiment through social media is limited. Previous research has mainly focused on how alternative digital sources, including social media data, can be utilized to track immigration sentiment [9]. However, these studies have often been confined to specific events [10] or to exploring monitoring techniques [11]. It has been observed that social media platforms play a role in promoting anti-immigration sentiment by efficiently spreading misinformation and polarizing content related to migrants [12]. Such content can influence public opinion by amplifying fears and biases against migrants [13]. Additionally, social media's significance as a platform for news consumption contributes to shaping public sentiment on immigration, as it has been shown to reinforce existing biases and polarization [14]. Some studies have examined the extent of polarization in online debates about immigration within particular countries [15,16], as well as the involvement of bots in disseminating anti-immigration content [17]. Public claims have also blamed the high level of polarization on social media, the prominent role of social media influencers, and the increasing speed of anti-immigration content as key factors fostering a contentious debate on immigration. This underscores the urgency of investigating the dynamics of online immigration sentiment comprehensively.
High levels of polarization can contribute to alarming levels of isolation in content consumption, reinforcing existing beliefs on immigration. By measuring the extent of polarization in the online debate on immigration, we can plan initiatives to reduce it, thereby promoting a healthier online discourse. Yet, little is known about the degree of polarization within the social media discourse on immigration. Key users on online social media can also drive polarization and online extremism by shaping the public debate. Indeed, social media has witnessed the rise of so-called 'influencers', particularly relevant users who can steer the public debate by reaching a large number of people. This includes people who can be key sources of content by generating or spreading a disproportionate amount of content. The immigration debate has seen the rise of key figures (e.g. Nigel Farage during the Brexit referendum) with substantial power over the immigration debate. However, we have limited knowledge of how key producers and spreaders shape the online debate on immigration on social media.
Online social media platforms are also incredibly effective in sharing information at an unprecedented pace. However, this has raised concerns due to the rapid dissemination of inaccurate, false, and violent information, particularly targeting ethnic minority groups, which may lead to physical violence [18]. Automated accounts, known as bots, could enhance the dissemination of content by quickly reposting and generating partisan content, further reinforcing extremism [19]. Yet, few studies have measured the speed of immigration-related content on social media and the role of key users, including bots, in producing and disseminating it.
To develop a better understanding of online immigration sentiment, we need to investigate three key dimensions: the extent of the polarization of the online debate, the key sources, and the speed of immigration-related content. Prior work has examined these dimensions in other contexts, such as the speed of online misinformation [20], the polarization of political discourse on online platforms [21], and the primary sources of content during elections [22], but not within the immigration debate.
To address these gaps, this study leverages natural language processing (NLP) methods and social network analysis (SNA) to explore these three key dimensions:
• Determine the extent of polarization of social media immigration sentiment;
• Identify the key producers and spreaders of social media immigration-related content;
• Measure the speed at which this content is disseminated through social media immigration communities.

[...] authors' metadata. This process of retrieving the full Tweet object from Twitter starting from a tweet ID is referred to as hydration. The data can also be retrieved through Twitter's API using the code of Rowe et al. (2021), available at https://github.com/fcorowe/stigma_covid/blob/main/methods/01_collecting_and_processing_twitter_data/collecting_and_processing_twitter_data.ipynb. Alternatively, several easy-to-use tools have been developed for such purposes, including the Hydrator and Twarc. Data can be retrieved by using Twitter's API for researchers who meet the criteria for access to confidential data. More information on how to access the data is available at https://developer.twitter.com/en/usecases/do-research. Specifically, a researcher can either apply to get access to the data under the "EU Digital Services Act" by filling out this application form (https://forms.gle/btDwenPF7M3hgSvw7) or subscribe to one of the API services available at Twitter/X. Access to the data under the "EU Digital Services Act" is limited. According to the X/Twitter policy, "[access] is only for a narrow subset of EU research related to the DSA (Digital Services Act). For general academic research, please enroll in one of our X API access tiers." Twitter/X does not provide an email address for requesting further information or contacting them directly.
Based on 220,870 tweets collected by Rowe et al. [23] for the UK, the research examines immigration sentiment between December 1st 2019 and April 30th 2020, a period which includes the UK general election (December 12th 2019) and the first wave of the COVID-19 pandemic in the UK. Twitter is an alternative source of data to measure social phenomena including immigration sentiment [9], although its usage presents its own challenges, including multiple biases in its user base and data quality [24]. The granularity and the scale of the data make Twitter an excellent source to study public sentiment on immigration almost in real time and potentially to design effective policies to halt online abuse and hate.
The rest of the paper is structured in five sections. The next section presents existing work on immigration sentiment on online social networks and patterns of general attitudes towards immigration. It also describes the role of social media in shaping public opinion, analyzing social media polarization, users' heterogeneity in producing and spreading content, and the speed at which online content circulates. To this end, we draw on research on the online spread and discussion of content in other research areas, including misinformation and political polarization. We then introduce the data and methods used in Section Two before presenting the results in Section Three. Next, we discuss our findings in Section Four, and we conclude in Section Five by delineating the policy implications of our research and avenues for future research.

Background
Migration sentiment is a crucial social issue that influences integration, discrimination, and human rights. Positive sentiment fosters social cohesion, while negative sentiment hinders integration and perpetuates discrimination. Understanding public opinion on immigration is essential for inclusive policies, including leveraging the economic potential of migration [25,26]. Immigration has emerged as a significant challenge in various countries, often depicted as a major concern that influences political narratives, as exemplified by events like Brexit and the ascent of far-right parties [5]. Extreme right-wing narratives have often depicted migrants as the 'scapegoats' for all the challenges within a country, creating the conditions for a 'moral panic' towards migrants and immigration [27].
Immigration has also been a prominent and enduring concern for the UK population over the past two decades, consistently ranking among the top three most important issues for British voters [28]. This issue has been linked to a concerning rise in racially motivated hate crimes [29]. Survey polls also indicate a growing negative opinion on immigration, while the significance of immigration has decreased since the 2016 EU referendum in the UK [30].
Assessing immigration sentiment has remained a complex endeavor, aimed at offering a real-time understanding of the multifaceted issues associated with immigration. Established methods of data collection have been mostly based on surveys that are time-consuming, temporally sparse, expensive, and spatially coarse. These methods also rely on pre-defined questions which offer insights into specific issues of the immigration process. Empirical research has mostly relied on a question asking respondents for their position on raising or decreasing immigration quotas [31]. The view of immigration based on these questions is less positive than if we consider other dimensions of the impact of immigration on culture and diversity [32]. Alternative approaches can overcome some of the key drawbacks of well-established data sources. Social media has emerged as a unique source to track public sentiment on immigration [33]. These platforms not only capture content related to immigration but also play a significant role in shaping public opinions on the subject [27,34]. Acting as a de facto public forum for global opinion exchange, social media facilitates discussions on immigration. Furthermore, it can shape the public debate on immigration through various mechanisms and features inherent to these platforms.
Firstly, social media platforms serve as a primary source of news and information for many individuals [35]. Studies have shown that unfavorable news coverage about migration can reduce the acceptance of asylum seekers by imposing more restrictive policies [36]. This coverage can be enhanced by social media, further reinforcing this news consumption pattern [14]. Indeed, the information presented on social media platforms can be fragmented and biased, leading to potential polarization and the reinforcement of existing attitudes toward migrants [16].
Secondly, social media enables the rapid spread of content, especially emotionally charged narratives. This rapid dissemination of online emotional responses can quickly trigger violent actions against migrants [12,18]. Indeed, social media platforms have facilitated the rise of online activism and mobilization around migration issues. Campaigns, hashtags, and user-generated content on social media can raise awareness, shape public discourse, and influence public opinion [37-41]. Activist groups and organizations leverage social media to amplify their messages and mobilize support for their causes. For example, both pro-immigration movements, such as 'Refugees Welcome', and anti-immigration movements, such as the 'Leave' campaign during Brexit, have been facilitated by social media engagement [5,6].
Social media platforms are also known for spreading misinformation and disinformation related to migration [12]. False or misleading narratives on migrants can shape public opinion by fueling fears, stereotypes, and biases [13]. Research has shown that online misinformation can have a significant impact on public perceptions and attitudes towards immigration [33]. Although there is an increasing number of empirical studies analyzing public opinion on migration, the research in this area remains limited. Existing research mostly focuses on exploring the potential of social media to analyze public opinion on migration [33] or on assessing the prevalence of positive and negative sentiment towards immigration [11,42,43]. Temporally, analysis over a longer time span is missing, since most previous research has explored how immigration sentiment takes shape during notable events in Spain [10] and in the UK [44].
As a result, an enhanced understanding of the key structural elements of the online debate on immigration is required. This includes identifying the extent of the network polarization, the size of a network, the key actors involved in the debate, and how content disseminates. By analyzing these elements, we can gain insights into the dynamics of information flow, social interactions, and the formation of sentiment. The polarization in the online debate on immigration has received limited research attention despite its societal significance. Polarization in online immigration sentiment has been analyzed in Italy [16] and Russia [15]. Vilella et al. [16] showed that the Italian immigration debate on Twitter is highly polarized, with a low level of interaction between communities on opposite sides of the spectrum. On the other hand, Ziems et al. [45] have shown how users involved in anti-Asian hate and counter-hate online speech during COVID-19 in the US tend to have interconnected interactions without confining themselves to isolated groups.
The online public discourse on immigration can be significantly influenced by a few influential users. Yet, we have limited knowledge of how key users could shape the online debate on immigration. The speed at which different immigration-related content spreads within a network is also relevant. Empirical evidence shows that the quick spread of online misinformation about migrants can rapidly lead to physical violence [12]. Yet no study has investigated the pace at which immigration content spreads online and assessed the differences between anti- and pro-immigration content.
Bots could play an active role in shaping the online debate on immigration by both acting as primary sources in a network of users and accelerating the pace of content dissemination. Indeed, online social networks have also witnessed the emergence of non-human users, otherwise known as 'bots', which can increase polarization and enhance the spread of misinformation [19]. Bastos and Mercea [17] have studied the role of bots in spreading pro-Brexit campaign messages, highlighting how bots can be used to leverage quasi-fake news, mostly around immigration. However, the specific impact of bots on the spread of immigration-related content has yet to be tested.
While the existing literature on immigration lacks the study of these key dimensions on social media, previous studies have explored these characteristics across multiple research areas, including the online debate on politics [21], COVID-19 [46], and climate change [47], as well as misinformation [20], fake news [22], and conspiracy theories [48]. Social media have been publicly blamed for increasing societal polarization by fueling divisions through the creation of 'echo chambers' or 'filter bubbles', which reinforce existing ideas and biases by carefully excluding different voices from a certain debate [49,50]. On the other hand, other research suggests quite the opposite. Social media platforms expose users to a multiplicity of perspectives, including those from the opposite spectrum, thereby instigating negative and hateful interactions [51,52] which tend to become increasingly toxic as the communities involved are more polarized [53]. This has been empirically observed in the immigration debate in the UK, a practice also known as 'ratioing' or 'boo and cheer', in which strong pro-immigration messages provoke an even stronger anti-immigration response [54].
The online debate on immigration can also be vastly shaped by a few key actors and users. Previous studies have shown that so-called 'influencers', rather than the content per se, have a substantial role in fostering extremism in the user base [55]. Similarly, one study [22] found that 0.1% of users shared 80% of the fake news on Twitter during the 2016 US presidential election. Twitter itself has used this approach to highlight the disproportionate contribution of users across the platform [56]. The pace at which online content on immigration spreads could also vastly change the patterns in the debate as well as trigger uncontrolled offline reactions [12]. Research on misinformation has shown that false content spreads faster [20], while conspiracy-based content seems to spread more slowly than science-based information [48]. Evidence suggests that hateful content spreads faster and further on social media platforms [57].
The impact of bots on the spread of online content has also been examined. Regarding the relationship between the speed of Twitter content and non-human users, Shao and Ciampaglia [58] discovered a positive correlation, although the significance of this finding has been challenged by other researchers [20]. Nonetheless, bots appear to have an amplifying role in disseminating divisive and hateful content [59,60] and in reshaping the dynamics of the debate [61,62].

Data
The research leverages Twitter data to analyze online public sentiment toward immigration. The data were collected by Rowe et al. [23] across five countries (Germany, Italy, Spain, the UK, USA) between December 1st, 2019 and April 30th, 2020 for a total of 30.39 million data points. This study uses only the UK data, resulting in a total of 220,870 tweets. Rowe et al. [9] used a methodology to collect a curated sample of tweets leveraging the Premium Twitter API.
Each day, 500 tweets were obtained for each of three query types (hashtag, account, and key terms), for a total of 1500 tweets per day. Data were processed by Rowe et al. [23] by: (1) removing duplicated retweets and tweets not relevant to human migration (e.g. concerning bird migrations); (2) converting emojis and hashtags into text; and (3) removing account usernames, URLs, and hyperlinks. The Twitter data set was released in compliance with Twitter's Terms and Conditions, under which Rowe et al. were unable to publicly release the text of the collected tweets, but could release the tweet IDs, which are unique identifiers tied to specific tweets.

Methods
Our analysis involves six stages. First, we classify each tweet based on its stance towards immigration using a fine-tuned BERT transformer. Second, we label users as pro- or anti-immigration based on the proportion of anti- or pro-immigration tweets they shared. Third, we use social network analysis (SNA) methods to understand the structure of the network of users involved in the Twitter debate on migration. Specifically, we measure the strength of polarization in the debate and the density of the pro- and anti-immigration users' networks, and quantify the strength of the relationship between spreaders as well as between producers of anti- and pro-immigration content. Fourth, we identify the key spreaders and producers of anti- and pro-immigration content by selecting the top 1% of users by number of retweets (spreaders) and the top 1% by number of tweets generated (producers). Fifth, to understand differences in speed across types of content, we calculate two complementary cumulative distribution functions (CCDFs) to assess the difference in the number of retweets between anti- and pro-immigration content, as well as the median time for an anti- or pro-immigration tweet to reach a certain number of retweets. Sixth, we identify bots, i.e. non-human users, within the network to understand their impact on content dissemination. Each stage is described in detail below.

Text classification.
We built a text classifier to identify the different standings toward immigration in our Twitter dataset. A text classifier uses artificial intelligence and machine learning to automatically identify different types of content, processing large amounts of data with speed and accuracy [63]. The research uses a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to build a custom-made text classifier that has been fine-tuned on a random subset of manually labeled data.
BERT is a deep learning architecture developed by researchers at Google [64] that is pre-trained on a large unlabeled text corpus. BERT belongs to a class of methods in natural language processing known as transformers. Transformers constitute the current 'state of the art' in NLP methods [65] and are the core architecture of large language models. Transformers have been previously used to detect hate speech, outperforming previous methods [66-68]. Indeed, transformers can be tuned for specific tasks such as immigration sentiment detection [45].
To fine-tune the transformer, we randomly sampled 1000 tweets from our dataset. We manually labeled them using four exclusive categories: 'pro-immigration', 'anti-immigration', 'neutral', and 'unclassified'. 'Pro-immigration' labels describe tweets expressing positive opinions towards immigration; 'anti-immigration' labels describe tweets expressing negative opinions towards immigration; 'neutral' labels identify content with no clear sentiment towards immigration; and 'unclassified' labels categorize content which cannot be classified into one of the categories above, either because the tweets are not related to the immigration debate (e.g. "Question: Has anyone successfully attempted a Windows vCenter 6.5 (with embedded PSC) to vCSA 6.5 migration? /cc @kev_johnson") or because they cannot be assessed without their context (e.g. "@RSPCA_official @ukhomeoffice Maybe @EventbriteGB could assist?"). The training dataset is unbalanced across the four categories, so we oversampled tweets labeled 'neutral' and 'anti-immigration' and undersampled tweets labeled 'unclassified' in the final training dataset. This dataset is then used to fine-tune our classifier. The optimal learning rate is determined iteratively to optimize accuracy and minimize the loss, and it is selected visually by analyzing the auto-generated loss plot produced by a built-in ktrain library function. The number of epochs, the batch size, the maximum level of tokenization, and the maximum number of features are all selected through an iterative process, building multiple classifiers and progressively selecting the best performers by looking at the F1 scores across labels and the overall AUC-ROC score.
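The rebalancing step can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the `rebalance` helper, the common per-label `target` size, and the toy rows are all assumptions made for illustration, since the text does not report the resampled class sizes.

```python
import random
from collections import Counter


def rebalance(rows, target, seed=0):
    """Resample (text, label) rows so every label ends up with `target` rows.

    Assumption: all labels are equalized to one size; the paper does not
    state the actual class sizes after resampling.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))

    balanced = []
    for items in by_label.values():
        if len(items) >= target:
            # Undersample: draw `target` rows without replacement.
            balanced.extend(rng.sample(items, target))
        else:
            # Oversample: keep all rows, then repeat random ones.
            balanced.extend(items + rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical labeled sample, skewed towards 'unclassified'.
rows = (
    [("t", "unclassified")] * 6
    + [("t", "pro-immigration")] * 4
    + [("t", "anti-immigration")] * 2
    + [("t", "neutral")] * 2
)
balanced = rebalance(rows, target=4)
counts = Counter(label for _, label in balanced)
```

Oversampling with replacement duplicates minority-class tweets, so any evaluation split should be held out before resampling to avoid leakage.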

User classification.
We classified users based on the tweets they shared. A user is labeled pro-immigration if more than 50% of their tweets are pro-immigration, while the opposite holds for anti-immigration users. A binary variable is assigned to all tweets labeled either anti- (0) or pro-immigration (1). Each user is then identified as pro-immigration or anti-immigration by calculating the average of this binary variable across their tweets.
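The majority rule described above can be sketched as follows. The function name, the short labels 'pro' and 'anti', and the 'tie' outcome are illustrative assumptions; in particular, the text does not say how users with an average of exactly 0.5 are handled.

```python
from collections import defaultdict


def classify_users(tweets):
    """Label each user from (user_id, tweet_label) pairs.

    Only tweets labeled 'pro' (coded 1) or 'anti' (coded 0) contribute;
    other labels are ignored, as in the binary coding described in the text.
    """
    coded = defaultdict(list)
    for user, label in tweets:
        if label == "pro":
            coded[user].append(1)
        elif label == "anti":
            coded[user].append(0)

    stance = {}
    for user, values in coded.items():
        mean = sum(values) / len(values)
        if mean > 0.5:
            stance[user] = "pro"
        elif mean < 0.5:
            stance[user] = "anti"
        else:
            stance[user] = "tie"  # assumption: tie handling is not specified
    return stance


# Hypothetical classified tweets.
example = [
    ("a", "pro"), ("a", "pro"), ("a", "anti"),
    ("b", "anti"),
    ("c", "pro"), ("c", "anti"),
    ("d", "neutral"),
]
stances = classify_users(example)
```

User "d" receives no stance here because none of their tweets is pro- or anti-immigration.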

Network analysis.
We employed Social Network Analysis (SNA) to examine the structure of the Twitter discourse regarding migration. In this analysis, we designated Twitter users as nodes and represented their interactions through retweets as edges. Nodes in this context serve as both producers and spreaders of content, with edges indicating the flow of content from one node to another, establishing a directed network. Each node has a degree that quantifies the number of edges associated with that node. A higher node degree signifies a higher number of retweets. Within our directed network, we further distinguish nodes based on their in-degree and out-degree values. In our study, a node with a high in-degree signifies a user that frequently retweets, essentially a prominent content spreader. Conversely, a node with a high out-degree indicates a user who generates content that is regularly and widely retweeted, essentially a prolific content producer. The resulting directed network comprises 34,063 nodes and 48,883 edges. Additionally, we identified two distinct subnetworks: the pro-immigration network and the anti-immigration network. These subnetworks were discerned by evaluating users' stances on immigration.
We assessed the network's structure, namely the polarization in the network, the density of the anti- and pro-immigration networks, and the strength of the interactions between producers as well as between spreaders in the anti- and pro-immigration networks. We computed four metrics: the attribute assortativity coefficient, the in-degree assortativity coefficient, the out-degree assortativity coefficient, and edge density.
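As a minimal sketch of this construction, the retweet network can be represented as a set of directed edges running from the original author to the retweeter, so that in-degree counts retweets made and out-degree counts retweets received. The user names are hypothetical, and collapsing repeated retweets between the same pair into one unweighted edge is an assumption.

```python
from collections import Counter

# Hypothetical retweet records: (original_author, retweeting_user).
retweets = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("alice", "dave"),
    ("erin", "bob"),
    ("erin", "bob"),  # repeated pair, collapsed below
]

# Directed edge = content flowing from the producer to the spreader.
edges = set(retweets)
nodes = {user for edge in edges for user in edge}

out_degree = Counter(src for src, _ in edges)  # times a user was retweeted
in_degree = Counter(dst for _, dst in edges)   # times a user retweeted others
```

Here "alice" is the main producer (out-degree 3) and "bob" the most active spreader (in-degree 2).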
We calculated the attribute assortativity coefficient as:

$$ r = \frac{\sum_i e_{ii} - \sum_i a_i b_i}{1 - \sum_i a_i b_i} $$

where $e_{ii}$ is the probability of an edge (a retweet) between two nodes (users) that both have a given standing $i$ towards immigration, $a_i$ is the probability that an edge has as origin a node with standing $i$, and $b_i$ is the probability that an edge has as destination a node with standing $i$.
We also calculated the in-degree (out-degree) assortativity coefficient as:

$$ r = \frac{\sum_i \sum_j i\,j\,(e_{ij} - a_i b_j)}{\sigma_a \sigma_b} $$

where $\sum_i \sum_j i\,j\,(e_{ij} - a_i b_j)$ takes, for each pair of in-degree values $i$ and $j$, the product of the node properties and the difference between the observed edge probability $e_{ij}$ and the value $a_i b_j$ expected from the marginal distributions, and sums these products across all pairs of in-degree values. $\sigma_a \sigma_b$ in the denominator normalizes the result, dividing by the product of the standard deviations of the distributions of in-degrees. Symmetrically, the same equation calculates the out-degree assortativity coefficient using out-degree values. Finally, we calculated the edge density of the directed network:

$$ d = \frac{m}{n(n-1)} $$

where $m$ is the number of edges (retweets) in the users' network and $n$ is the number of nodes (users). The edge density measures the level of interconnectedness within the network based on the retweets. The assortativity coefficient is a measure to assess homophily in a network.
Homophily captures the tendency of individuals to form connections with others sharing similar attributes, such as opinions or beliefs. In the context of polarization, attribute assortativity can be used to measure the tendency of nodes in a network to connect with others sharing similar opinions or beliefs, such as their standing towards immigration. The more assortative a network, the more polarized it is. The attribute assortativity coefficient is calculated as the Pearson correlation coefficient by comparing the observed connections between nodes with a similar standing towards immigration to what would be expected in a random network with the same degree distribution. We also want to understand how spreaders and producers engage within the debate. To do so, we calculate the in-degree and out-degree assortativity coefficients. The in-degree assortativity coefficient measures the tendency of nodes to connect with other nodes with a similar number of incoming connections, i.e. the number of retweets shared. Similarly, the out-degree assortativity coefficient measures the tendency of nodes to connect with users having a similar number of outgoing connections, i.e. the number of retweets received. The three assortativity measures were calculated as Pearson correlation coefficients between pairs of linked nodes, ranging from -1 to 1. A coefficient of 1 indicates a perfectly assortative network. On the other hand, a coefficient of -1 indicates a perfectly disassortative network.
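The two kinds of assortativity can be sketched in plain Python on toy edge lists. This is an illustrative implementation of the standard definitions, not the authors' code; the function names and example graphs are hypothetical, and the in-degree version is computed directly as the Pearson correlation between the in-degrees found at the two ends of each edge.

```python
from collections import Counter
from math import sqrt


def attribute_assortativity(edges, attr):
    """Assortativity of a categorical node attribute over directed edges."""
    m = len(edges)
    e = Counter((attr[u], attr[v]) for u, v in edges)  # edge counts per class pair
    classes = {attr[u] for edge in edges for u in edge}
    trace = sum(e[(i, i)] for i in classes) / m         # share of same-class edges
    a = {i: sum(e[(i, j)] for j in classes) / m for i in classes}  # origins
    b = {i: sum(e[(j, i)] for j in classes) / m for i in classes}  # destinations
    expected = sum(a[i] * b[i] for i in classes)
    # Undefined (division by zero) if every edge end belongs to one class.
    return (trace - expected) / (1 - expected)


def in_degree_assortativity(edges):
    """Pearson correlation of in-degrees at the origin and destination of edges."""
    in_deg = Counter(v for _, v in edges)
    x = [in_deg[u] for u, _ in edges]
    y = [in_deg[v] for _, v in edges]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy)


stance = {"a": "anti", "b": "anti", "c": "pro", "d": "pro"}
echo_chamber = [("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")]
cross_talk = [("a", "c"), ("c", "a"), ("b", "d"), ("d", "b")]

r_echo = attribute_assortativity(echo_chamber, stance)
r_cross = attribute_assortativity(cross_talk, stance)
r_in = in_degree_assortativity([("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")])
```

The first toy graph contains only within-stance retweets and the second only cross-stance retweets, which correspond to the two extremes of the coefficient's range.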
Edge density measures how interconnected the users within a network are. It is calculated by dividing the number of edges present in the network by the total number of potential edges that could exist between its nodes. This metric is often used to assess the sparsity of a network. We considered a network with low edge density to be sparse, indicating that users are not extensively connected through tweets and retweets, but rather are relatively isolated. On the other hand, a network with high density has more homogeneous interactions, signaling more engagement within a community.
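As a worked example, the density of the full retweet network can be approximated from the node and edge counts reported above. The sketch assumes a directed network without self-loops, so that the number of potential edges is n(n-1); the function name is hypothetical.

```python
def directed_density(n_nodes, n_edges):
    """Share of possible directed edges (no self-loops) that are present."""
    return n_edges / (n_nodes * (n_nodes - 1))


# Counts reported for the full directed network.
density = directed_density(34_063, 48_883)
print(f"{density:.2e}")
```

Under these assumptions the value is on the order of 4e-5, which illustrates how sparse retweet networks typically are and why densities of subnetworks are more informative when compared as ratios.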

Key sources.
We defined key producers and spreaders of Twitter content. A producer is defined as a user generating Twitter content. A spreader is a user who shares (retweets) someone else's content. Users can be both producers and spreaders of content. To define key producers and spreaders, we first identified the number of users generating content (i.e. producers) and the number of users sharing content (i.e. spreaders), and selected the top 1% of users in these two groups by the count of tweets and retweets, respectively. We carried out an analysis of the networks of users with pro- and anti-migration stances and compared the top 1% to the remaining 99% of users in each group.
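A minimal sketch of the top 1% selection is shown below. The `top_share` helper, the rounding rule (rounding the cutoff up and keeping at least one user), and the toy counts are assumptions for illustration.

```python
def top_share(counts, percent=1):
    """Return the top `percent`% of users by count and their share of the total.

    counts: dict mapping user -> number of tweets (or retweets).
    """
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    # Integer ceiling of len(ranked) * percent / 100, with at least one user.
    k = max(1, (len(ranked) * percent + 99) // 100)
    top = ranked[:k]
    total = sum(counts.values())
    return [user for user, _ in top], sum(c for _, c in top) / total


# Hypothetical group of 200 producers: two very active users, 198 occasional ones.
tweet_counts = {"user_0": 50, "user_1": 31}
tweet_counts.update({f"user_{i}": 1 for i in range(2, 200)})

key_producers, share = top_share(tweet_counts)
```

The same function applies to spreaders by passing retweet counts instead of tweet counts.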

Content speed and cascade analysis.
We built tweet cascades to analyze the speed and reach of migration-related content in our user networks. A tweet cascade is generated using information on a tweet and its subsequent retweets, thus capturing the history of a tweet and its dissemination on Twitter. Using tweet cascades, we can determine the pace of dissemination for a tweet in the network by measuring the time taken to reach a given number of retweets.
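The time a cascade takes to reach a given number of retweets can be sketched as follows. The `time_to_k` helper and the timestamps are hypothetical; cascades that never reach the threshold are simply excluded before taking the median, which is an assumption about how such cases are treated.

```python
from datetime import datetime, timedelta
from statistics import median


def time_to_k(tweet_time, retweet_times, k):
    """Seconds between an original tweet and its k-th retweet, or None."""
    if len(retweet_times) < k:
        return None
    kth = sorted(retweet_times)[k - 1]
    return (kth - tweet_time).total_seconds()


t0 = datetime(2020, 1, 1, 12, 0)
minutes = lambda m: t0 + timedelta(minutes=m)

# Hypothetical cascades: original tweet time and retweet times.
cascades = {
    "A": (t0, [minutes(5), minutes(10), minutes(30)]),
    "B": (t0, [minutes(2), minutes(4)]),
    "C": (t0, [minutes(1)]),
}

times = {cid: time_to_k(start, rts, k=2) for cid, (start, rts) in cascades.items()}
reached = [t for t in times.values() if t is not None]
median_seconds = median(reached)
```

Computing this median separately for anti- and pro-immigration cascades, and for several values of k, gives a simple comparison of dissemination speed between the two types of content.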
The size of a tweet cascade can also be measured to capture its reach; that is, the number of times an original tweet was retweeted. To visually represent tweet cascades, we used a complementary cumulative distribution function (CCDF). The CCDF helps to show the fraction of users (either pro- or anti-immigration) with at least a certain number of retweets.
By displaying the CCDF on a chart, we can graphically illustrate the distribution of retweet counts by showing the probability of observing a certain number of retweets or more by type of content, i.e. anti- or pro-immigration. The y-axis showing the CCDF probability is on a logarithmic scale given the skewed distribution of the cascades. Bots are subsequently removed from the data and the CCDFs are recalculated to understand their impact.
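An empirical CCDF of cascade sizes can be computed with a few lines of Python. This is a minimal sketch with a hypothetical `ccdf` helper and toy cascade sizes; it returns, for each observed size x, the fraction of cascades with x retweets or more.

```python
def ccdf(sizes):
    """Empirical CCDF: list of (x, P(size >= x)) for each distinct size x."""
    n = len(sizes)
    return [(x, sum(1 for s in sizes if s >= x) / n) for x in sorted(set(sizes))]


# Hypothetical cascade sizes (retweets per original tweet).
cascade_sizes = [1, 1, 2, 5]
points = ccdf(cascade_sizes)
```

Computing one such curve per type of content and plotting both with a logarithmic y-axis shows whether large cascades are more frequent for one type than for the other.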

Bot analysis.
We also identified non-human actors, i.e. bots, and assessed how the content they generate affects our results. Specifically, we identified accounts as bots and removed them to assess whether our results changed in any statistically significant way. We employed Birdspotter, a Python library created by Ram et al. [69], to identify bots. Birdspotter processes raw JSON tweet data and generates a 'botness' score, which indicates the likelihood of an account being a bot. We identified users as bots if they had a botness probability higher than 90%.
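The thresholding step can be sketched as below. The example assumes botness scores have already been produced by a tool such as Birdspotter and are available as a plain mapping from user to score; it does not call the Birdspotter API, and all names and values are hypothetical.

```python
BOT_THRESHOLD = 0.90  # users strictly above this botness score are treated as bots


def split_bots(botness_by_user, threshold=BOT_THRESHOLD):
    """Partition users into bots and non-bots given precomputed botness scores."""
    bots = {u for u, score in botness_by_user.items() if score > threshold}
    humans = set(botness_by_user) - bots
    return bots, humans


def drop_bot_retweets(retweets, bots):
    """Keep only retweets where neither the author nor the retweeter is a bot."""
    return [(a, r) for a, r in retweets if a not in bots and r not in bots]


# Hypothetical scores and retweet pairs (original author, retweeter).
botness = {"u1": 0.95, "u2": 0.90, "u3": 0.10}
retweets = [("u1", "u3"), ("u3", "u2"), ("u2", "u1")]

bots, humans = split_bots(botness)
clean_retweets = drop_bot_retweets(retweets, bots)
```

Rerunning the speed and cascade analyses on `clean_retweets` and comparing them with the full data is one way to gauge how much bots affect the results.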

Ethical standards
The research meets all ethical guidelines, including adherence to the legal requirements of the study country. Ethical approval for the project was granted by the University of Liverpool Research Ethics Committee (ref: 7654).

Text classifier results and performance
The accuracy of the text classifier varied across the different labels. F1 scores for 'Unclassified' and 'Neutral' tweets were 0.73 and 0.75, respectively, while those for the 'Anti-immigration' and 'Pro-immigration' labels were 0.55 and 0.64. Table 1 shows the tweets in the dataset as labeled by the text classifier. The largest share of the tweet content in our sample was classified as 'Unclassified' (36.1%), followed by pro-migration tweets (27.9%) and anti-migration tweets (19.7%). A minority of the tweets were classified as neutral (16.1%).
Compared to the pro-migration network (in blue), the anti-migration network (in red) is denser and smaller in size. An attribute assortativity coefficient of 0.81 suggests that the level of polarization between these users' networks is high. Users with a similar stance towards migration tend to exclusively retweet content produced by users with similar migration sentiment. Additionally, measured by edge density, the anti-immigration network is 2.8 times denser than the pro-immigration network. The higher density of the anti-immigration network indicates a strong level of connectivity and engagement among its users.
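To make the attribute assortativity coefficient more concrete, the sketch below computes it for a toy retweet network, assuming the standard mixing-matrix definition: r = (sum_i e_ii - sum_i a_i b_i) / (1 - sum_i a_i b_i), where e_ij is the fraction of edges going from stance i to stance j and a_i and b_i are its row and column sums. That this matches the exact computation used in the study is an assumption, and the names and data are hypothetical.

```python
from collections import Counter


def attribute_assortativity(edges, stance):
    """Assortativity of a categorical node attribute over directed edges."""
    pair_counts = Counter((stance[u], stance[v]) for u, v in edges)
    m = sum(pair_counts.values())
    if m == 0:
        raise ValueError("no edges")
    labels = {label for pair in pair_counts for label in pair}
    # Fraction of edges that connect users with the same stance.
    trace = sum(pair_counts[(l, l)] for l in labels) / m
    # Row sums (a) and column sums (b) of the normalized mixing matrix.
    a = {l: sum(c for (i, _), c in pair_counts.items() if i == l) / m for l in labels}
    b = {l: sum(c for (_, j), c in pair_counts.items() if j == l) / m for l in labels}
    expected = sum(a[l] * b[l] for l in labels)
    if expected == 1:
        raise ValueError("coefficient undefined when all edges share one label")
    return (trace - expected) / (1 - expected)


# Hypothetical stances and retweet edges (original author, retweeter).
stance = {"a1": "anti", "a2": "anti", "p1": "pro", "p2": "pro"}
separated = [("a1", "a2"), ("a2", "a1"), ("p1", "p2"), ("p2", "p1")]
mixed = separated + [("a1", "p1")]

r_separated = attribute_assortativity(separated, stance)
r_mixed = attribute_assortativity(mixed, stance)
```

A value of 1 corresponds to two fully separated communities, and each cross-stance retweet pulls the coefficient down, so a coefficient of 0.81 points to few retweets crossing between stances.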

The structure of the users' network
The type of engagement also differs across the two communities and across types of users, i.e. producers and spreaders. A positive in-degree assortativity coefficient suggests that users who frequently retweet content are inclined to interact with other similarly active retweeters, and vice versa. The anti-immigration community has a higher in-degree coefficient (0.31) than the pro-immigration community (0.06). This indicates that frequent retweeters engage with one another more in the anti-immigration community than in the pro-immigration community. Conversely, both communities exhibit out-degree assortativity coefficients close to 0, indicating the lack of a clear assortative or disassortative network structure in terms of out-degrees.

Key producers of the anti (pro) immigration content
Table 2 shows key metrics to assess the role of the top 1% of users by the number of tweets published (excluding retweets) compared to the bottom 99%. Table 2 also reports the top 1% of producers by their stance on migration. The top 1% of producers account for a disproportionate amount of the content shared, particularly among anti-immigration users. The table reveals that key producers across both stances create on average 18.9 times more tweets than the remaining 99% of users. The leading producers of anti-immigration content create 23.18% of the total tweets against immigration. In contrast, the top 1% of producers of pro-immigration content account for 11.69% of the total tweets supporting immigration. We also analyzed the presence of bots in the discussion of migration on Twitter. Our results suggest that only a small proportion of top producers are bots (1.5%) across both anti- and pro-immigration users. On the other hand, a higher share (4.12%) of the remaining 99% of producers are identified as bots.

Key spreaders of the anti (pro) immigration content
Table 3 shows key metrics for the top 1% of spreaders by the number of retweets compared to the remaining 99%. It also reports the top 1% of users by migration sentiment. Similarly to the key producers, the results reveal that the top 1% of spreaders retweet 12.6 times more, representing 12.11% of the total retweets. Anti-immigration users represent a majority (70.08%) of the top spreaders. These users generate 21.36% of the total anti-immigration retweets, while key pro-immigration spreaders account for 6.01% of the total pro-immigration retweets. None of the top spreaders are classified as bots. On the other hand, a larger share of the bottom 99% of users are bots, namely 0.92%.
Considering both key spreaders and producers of content in our data, we can identify a set of 'super users' who are both key spreaders and key producers of content. Around 7.28% of the top 1% producers are also top 1% spreaders. Among these 'super users' (total count = 18), 72.2% are anti-immigration users, further suggesting that the public debate on Twitter around immigration could be disproportionately influenced by a tiny group of people on the platform.
The results in Fig 2 show that anti-immigration content spreads, on average, 1.66 times faster than pro-immigration tweets. They also reveal that the speed of diffusion differs across the size of tweet cascades. For smaller tweet cascades (<15), there is no difference in speed, but for mid-size cascades (15-120), anti-immigration content spreads consistently faster than pro-immigration tweets. Large cascades (>120) show a similar trend to the mid-range cascades. For some intervals (e.g. around 150), the data are sparser and less continuous, displaying sudden fluctuations. Bots do not seem to exert a major influence on these results. Removing users classified as bots decreases the median speed of anti-immigration content sharing by 12%, but it does not significantly alter the patterns identified above.

Tweet cascades
Fig 3 displays the CCDFs of pro- and anti-immigration cascades. This shows how likely anti- or pro-immigration content is to reach a given number of retweets. The x-axis represents the cascade size, which is the number of retweets of a particular immigration-related tweet. The y-axis represents the complementary cumulative distribution, indicating the probability that a cascade will have at least a certain number of retweets. The higher the value on the y-axis, the higher the probability that an anti- or pro-immigration tweet reaches a given cascade size.
In Fig 3, the blue curve represents the CCDF of pro-immigration cascades and the red curve that of anti-immigration cascades. Fig 3A presents the two CCDFs based on the complete dataset, while Fig 3B depicts the same CCDFs with bot users excluded. This comparison shows that, on the whole, anti-immigration tweets tend to have a broader reach across the Twitter user base, especially for cascade sizes smaller than 10. This suggests that anti-immigration content is, on average, more likely to be retweeted, indicating a stronger capacity to spread in the Twittersphere compared to pro-immigration content. When bots are excluded (Fig 3B), the observed difference is somewhat diminished, but their impact does not appear particularly significant. This suggests that while bots do play a role, they do not substantially alter the dynamics of anti-immigration content dissemination. However, for cascade sizes exceeding 120, the observed patterns become less clear and consistent in both panels. This could be due to various factors influencing the behavior and reach of immigration-related content in the online environment.

Key results
Immigration is a significant and divisive topic causing political frictions and social tensions. Social media can amplify these conflicts around immigration. Yet, there is limited quantitative evidence that uses online social networks to investigate the features and dissemination of online discussions about immigration in the UK. Our findings show that the public debate on Twitter around immigration in the UK is largely polarized. The anti-migration network is smaller than the pro-immigration community, but it is denser, suggesting that anti-migration users tend to be more engaged within their community. In the anti-immigration community, users who often retweet are more inclined to engage with fellow frequent retweeters, unlike in the pro-immigration network.
This finding reinforces the possible positive relationship between user engagement and polarization and could make the anti-migration network more efficient in spreading content internally. A small group of influential users, especially within the anti-immigration network, is responsible for a significant portion of the production and dissemination of polarizing tweets about immigration. Our data show that only 1% of the producers account for 16% of the total tweets in our dataset. This trend is more pronounced in the anti-immigration community, where slightly over 100 individuals generate 23.18% of the total tweets opposing immigration.
Likewise, the top 1% of the spreaders are responsible for 12% of the total retweets, and within the anti-immigration community, the top 1% of spreaders account for 21.36% of the total anti-immigration retweets, far more than the key pro-immigration users. The research reveals that tweets with negative sentiment towards immigration spread 1.66 times faster than positive content. Anti-immigration tweet cascades are systematically larger than pro-immigration cascades, suggesting a higher engagement with anti-immigration messages across time.

Implications
The extent of the polarization in the online public debate on immigration-related issues in the UK could increase online violence [70], which can ultimately trickle down to physical actions towards migrants and minorities [18]. Yet, a cause-effect relationship between online anti-migration sentiment and racially motivated physical crimes still needs to be established. Our findings show the existence of polarized communities on both pro- and anti-immigration stances, which can help policymakers design tools to mitigate polarization on both sides. Indeed, new evidence suggests that extreme online views might feed into each other [51,52], yet future studies should explore this thesis in the context of the immigration debate. The content within these polarized communities is largely generated by a small share of the total active users, especially in the anti-immigration community.
Our finding is consistent with previous research on politics [22] and the COVID-19 anti-vaccine debate [71], yet it had not been shown for the immigration debate. This implies that a small but determined group of people can vastly affect the online public sentiment on immigration. The significance of this discovery lies in its policy implications, as it has the potential to shift content moderation efforts from monitoring a large cohort of users to a smaller but over-influential subset. This is particularly relevant given the ongoing challenges with content moderation policies on major social media platforms [34]. Lastly, this novel insight contributes to a broader literature on how users engage in the dissemination of hateful content online.
The study highlights the rapid spread of anti-immigration content compared to pro-immigration content. Our findings underline the urgent need to take swift action to curb online abuses, particularly during specific events that historically trigger the spread of hate speech [72]. Additionally, given the relevance of immigration among the UK public, anti-immigration propaganda has the potential to significantly influence political outcomes. Previous studies have suggested that the dissemination of highly partisan and misleading content may have affected last-minute electoral decisions [73], particularly in highly contested areas [74]. In the context of health pandemics, the speed at which content spreads can significantly impede public efforts to contain them, emphasizing the importance of mitigating the rapid spread of anti-immigration content [75].
Our research underscores that bots have a limited influence on the discourse surrounding immigration online, indicating minimal effects on content generation, dissemination, and velocity. This echoes the insights uncovered by previous researchers [20] regarding the role of bots in expediting the dissemination of misinformation.
The research also expands the broader literature on content dissemination and polarization in social media debates. The findings suggest a reinforcing mechanism between higher levels of polarization, higher user engagement, a stronger role of influencers, and higher content speed on social media. Notably, the anti-immigration community exhibits strong polarization, higher user engagement, faster content dissemination, and greater influence from key content sources compared to the pro-immigration community. This could also suggest potential coordination in quickly spreading anti-immigration propaganda across an online community. Further studies should explore in more depth how these findings are interconnected and whether they have causal effects on one another. These findings align with previous studies indicating that social media platforms, by prioritizing user engagement, can foster polarization and conflicts [76], although reducing exposure to like-minded content does not decrease polarization in beliefs [77]. Policymakers should carefully evaluate how the attention-seeking mechanisms of online social networks impact harmful actions towards migrants. Further research is needed to establish a causal relationship between polarization, content engagement, and the viral nature of immigration-related content. Content moderation policies should prioritize accuracy, particularly in addressing anti-immigration speech driven by misinformation and misperception, especially during health emergencies like the COVID-19 pandemic.

Limitations and challenges
The research also presents a series of challenges and limitations. Twitter data represent a small portion of the broader online discussions on immigration given the total user base on the platform. Furthermore, Twitter's user base is generally younger and wealthier than the general UK population, and thus it is not representative [78]. A potential solution might be to include data from other social media platforms, but access to actionable users' data has proven to be challenging [79,80]. Moreover, Blank and Lutz [78] argue that no social media platform is representative of the overall UK population. Using weights to increase the representativeness of the data is a valid option to explore in future studies. Extending the timeframe of the research would benefit the generalization of the findings. The data collection process, as highlighted by Rowe et al. [23], presents some challenges since it is based on specific search terms, which capture only a share of the whole debate on immigration in the UK.
Methodologically, the text classifier can be improved, especially for labeling anti-immigration content. The current classifier might lead to an underestimate of the anti (pro) immigration tweets since it might falsely label antagonistic tweets towards immigration as positive, neutral, or unclassified. The false positives generated by wrong labeling might bias other results on speed, key producers, and key spreaders. Identifying bots can also be challenging, and thus it can affect the estimates of the bots' role in the speed and key sources of immigration-related content. Rauchfleisch and Kaiser [81] studied several limitations around the use of machine-learning approaches to discover bots. A major concern is that machine learning methods could be trained on datasets that are potentially very different from the one on which they are later used. A better approach would be to train the Birdspotter algorithm with a custom-labeled dataset, as done by Cresci et al. [82] and Echeverría et al. [83].
We measured the degree of polarization in the Twitter debate on immigration; however, our study does not identify methods to reduce polarization. Future studies should examine the drivers of this polarization in more depth.

Conclusion
Online social media have been widely blamed for raising tensions in the debates on immigration both online and offline, often fueled by active online anti-immigration communities. The escalation of online polarization, the considerable influence held by key influencers, and the rapid dissemination of anti-immigration content have been observed anecdotally as significant factors contributing to the anti-immigration debate on social media. Yet, no studies have quantified the extent of this polarization, identified the key sources, and assessed the speed of pro- and anti-immigration content dissemination. In this paper, we investigate the strength of polarization, pinpoint the primary producers and disseminators of content, and analyze the pace of anti- and pro-immigration content propagation on Twitter in the UK from December 2019 to May 2020.
We presented evidence of existing polarization in the online public debate on immigration with a high assortativity coefficient (0.81). We also identified anti- and pro-immigration networks on Twitter. The pro-immigration network was found to be 1.69 times larger than the anti-immigration network, although the latter is 2.8 times denser. We also identified the primary generators and disseminators of both anti- and pro-immigration content. Our findings indicated that only 1% of producers and disseminators generate 16% and 12% of the total content, respectively, a concentration that is especially pronounced within the anti-immigration community. Less than 1% of the total key producers and spreaders of content were identified as bots. Our research also revealed that anti-immigration content spreads 1.66 times quicker than pro-immigration content on Twitter, with bots playing a marginal role in changing the pace of the content.
The findings painted a concerning picture of the online discourse surrounding immigration. There is a significant level of polarization, as online communities largely engage within their own echo chambers, fostering a sense of isolation and reinforcing existing beliefs. This circumstance has the potential to further solidify existing perspectives and, at its worst, push current users toward more extreme positions on immigration.
The higher density of the anti-immigration network indicates that although more individuals have positive views about immigration, those against it have stronger connections within their online community. Within these communities, our findings suggested that by identifying and monitoring highly active users, strategic interventions could achieve significant reductions in online hate content, as 1% of the key nodes in the network produce 23% of the anti-immigration content in the UK.
Our findings also showed that anti-immigration content spreads faster than pro-immigration content, emphasizing the need for systematic tools to avoid the widespread dissemination of harmful messages. We also showed that bots have a marginal role in the online debate on immigration, with no significant impact on the production, diffusion, and speed of content, similar to what Vosoughi et al. [20] found when analyzing misinformation. The research is the first of its kind to measure the difference in speed between positive and negative messages on immigration on social media and to assess who the key sources of content are and what their contribution to the online debate is.
Our study expands our understanding of how to promote a healthier debate on immigration on social media, a crucial aspect in shaping immigration policies and a key determining factor in political elections. Failure to address online anti-immigrant sentiment can have serious consequences, including physical harm to those targeted by prejudice. Therefore, there is a critical need to implement tools that can quickly and effectively prevent online anti-immigration speech on a vast scale.

Fig 1
Fig 1 shows our retweet user networks classified by their stance towards migration. Each node has a size proportional to its degree, i.e. the number of retweets it has. Fig 1 reveals the existence of two distinct communities with limited interaction between them. The pro-immigration network is 1.69 times larger than the anti-immigration network.

Fig 1. Users network and polarization. Directed retweet network of anti-migration (in red) and pro-migration (in blue) users. Each node is a user and edges are retweets between a source (user creating the original tweet) and a target (user retweeting). The size of each node is proportional to its degree (both in and out). https://doi.org/10.1371/journal.pone.0307917.g001

Fig 2
Fig 2 displays how fast anti- and pro-immigration tweet cascades reach a certain size, measured in median minutes. Fig 2A reports the difference in speed between anti- and pro-immigration retweets across all users. Fig 2B reports this difference excluding users with a botness probability higher than 90%.

Fig 2. Speed of pro- and anti-immigration tweets. Cumulative count of retweets in median minutes on a log-log scale. A) Retweets including all users. B) Retweets without users with a botness probability higher than 90%.

Fig 3
Fig 3 displays the CCDFs of pro- and anti-immigration cascades. In Fig 3, the blue curve represents the CCDF of pro-immigration cascades and the red curve that of anti-immigration cascades. Fig 3A presents the two CCDFs based on the complete dataset. Fig 3B depicts the same CCDFs with bot users excluded. The comparison shows that, on the whole, anti-immigration tweets tend to have a broader reach across the Twitter user base, especially for cascade sizes smaller than 10.

Table 2. Producers grouped by quantile and immigration sentiment in the top 1% of the users.
Note: The top 1% of producers are selected according to the total number of tweets. Within this top 1%, we identify users classified as 'Anti-immigration' and 'Pro-immigration'. The 'Users' and 'Tweets' columns are counts. 'Tweets by user' is the total count of tweets by category divided by the number of users. 'Botness' is the mean value across the users within that category. https://doi.org/10.1371/journal.pone.0307917.t002

Table 3. Spreaders grouped by quantile and immigration sentiment in the top 1% of the users.
Note: The top 1% of spreaders are selected according to the total number of retweets. Within this top 1%, we identify users classified as 'Anti-immigration' and 'Pro-immigration'. The 'Users' and 'Retweets' columns are counts. 'Retweets by user' is the total count of retweets by category divided by the number of users. 'Botness' is the mean value across the users within that category. https://doi.org/10.1371/journal.pone.0307917.t003